Skip to content

Arch refactor#251

Merged
miiyakumo merged 6 commits into
mainfrom
arch-refactor
May 15, 2026
Merged

Arch refactor#251
miiyakumo merged 6 commits into
mainfrom
arch-refactor

Conversation

@miiyakumo

Copy link
Copy Markdown
Contributor

This pull request introduces several significant refactorings and improvements to the architecture abstraction and address handling in the kernel, with a focus on type safety, modularity, and code clarity. The changes include a major overhaul of the address API to use strongly-typed user addresses, improved user/kernel memory copy validation, and the separation of platform-specific operations from architecture traits. Additionally, architecture-specific ABI constants and relocation handling are now centralized for better maintainability.

Address API and Type Safety Improvements:

  • Replaced all usize user address arguments in the Arch trait and related implementations with the strongly-typed UA (user address) type, improving type safety for user/kernel memory operations. (os/src/arch/arch.rs, os/src/arch/arch_impl.rs) [1] [2] [3] [4] [5] [6]
  • Refactored the Address API: removed the TPA/TVA type aliases, renamed add/sub to add_bytes/sub_bytes, and generalized pointer conversion methods (as_ptr, as_mut_ptr, as_ref, as_mut) to allow explicit type conversion. Added from_ref and from_ptr constructors for virtual addresses. (os/src/arch/address.rs) [1] [2] [3] [4] [5]
  • Removed the AddressTranslator trait, simplifying cross-space address translation logic. (os/src/arch/address.rs)

User/Kernel Memory Copy Validation:

  • Added comprehensive range and page table flag validation for all user/kernel memory copy operations, ensuring that all accesses are within valid user space and have correct permissions. This validation is now performed in a single helper function used by all relevant methods. (os/src/arch/arch_impl.rs)

Architecture and Platform Abstraction Refactoring:

  • Moved platform-level operations (console I/O, power management, address mapping) out of the Arch trait into a new Platform trait, and provided a macro (impl_platform!) for easy implementation. Updated documentation to reflect this separation. (os/src/arch/arch.rs, os/src/arch/arch_impl.rs) [1] [2] [3]

Architecture-Specific ABI and ELF Handling:

  • Added a new module (os/src/arch/abi.rs) containing architecture-specific ELF constants, relocation classification, and /proc/cpuinfo content generation. This centralizes ABI details and relocation logic for RISC-V and LoongArch targets. (os/src/arch/abi.rs)

Build and Configuration Adjustments:

  • Simplified build.rs logic: always creates a real ext4 image for embedding, removing the need for test-mode detection and dummy images. Updated comments in .cargo/config.toml to clarify when build-std should be used. (os/build.rs, os/.cargo/config.toml) [1] [2]

miiyakumo added 6 commits May 15, 2026 15:34
…dability

- Added a new `boot` module in `mod.rs` to encapsulate boot-related functionality.
- Reorganized imports in `scheduler/mod.rs` for clarity and consistency.
- Enhanced syscall dispatch macros in `dispatch.rs` for better formatting and readability.
- Moved `read_from_user` and `write_to_user` imports in `fcntl.rs` for consistency.
- Cleaned up unused imports in `fs.rs` and ensured proper usage of `write_to_user`.
- Standardized unsafe block formatting in `io.rs` for better readability.
- Consolidated user buffer operations in `user_buffer.rs` for clarity.
- Improved memory space management in `mapping_area.rs` with clearer unsafe operations.
- Refined network socket handling in `socket.rs` for better error handling and readability.
- Updated test module imports for consistency with architecture changes.
- Adjusted various modules to ensure proper ordering and organization of imports.
- Replaced `write()` with `lock()` in various places to ensure proper locking mechanisms are used for thread safety.
- Updated task sleeping and waking functions to remove blocking variants, simplifying the API and improving clarity.
- Enhanced the `sleep_task` and `wake_up_task` functions to handle task states more efficiently.
- Removed the `RawSpinLockWithoutGuard` and `TicketLock` implementations to streamline synchronization primitives.
- Introduced a new `IntrGuard` implementation that supports nested interrupt protection, enhancing interrupt handling.
- Updated device driver initialization to use locking mechanisms consistently.
- Improved signal handling in the task management system to ensure proper task state transitions.
- Introduced a new `task` module to encapsulate architecture-neutral task setup data, including `ExecStackLayout`.
- Updated the `trap_frame` implementations for both LoongArch and RISC-V to utilize the new `set_exec_trap_frame_from_layout` method.
- Refactored memory space handling in ELF loading to support architecture-agnostic relocation resolution.
- Enhanced user stack layout setup for `execve` across architectures, consolidating logic and improving readability.
- Removed outdated LoongArch-specific memory utility functions in favor of architecture-agnostic implementations.
- Updated syscall numbers to use architecture-specific constants from the new `abi` module.
- Improved error handling and assertions in memory space operations to ensure compatibility with multiple architectures.
- Renamed `Paddr` to `PA` and `Vaddr` to `VA` across the memory management module for consistency.
- Updated frame allocator and memory space functions to use the new address types.
- Modified address conversion functions to align with the new naming conventions.
- Adjusted user buffer handling to utilize the new `UA` type for user addresses.
- Ensured all related tests and usages are updated to reflect these changes, maintaining functionality.
@miiyakumo miiyakumo merged commit dd374e4 into main May 15, 2026
1 check failed
@miiyakumo miiyakumo deleted the arch-refactor branch May 15, 2026 13:52

@gemini-code-assist gemini-code-assist Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request refactors the architecture abstraction layer to better support RISC-V and LoongArch, introducing a Platform trait for board-specific logic and a shared kernel::boot module. It also standardizes memory address types and enhances synchronization primitives, such as adding nesting support to IntrGuard. Technical feedback identifies a race condition in IntrGuard::new regarding the order of interrupt disabling and CPU ID retrieval, as well as potential undefined behavior in read_from_user when copy operations fail. Further concerns include a deadlock risk in validate_user_copy_range due to lock acquisition patterns, the loss of asynchronous I/O notifications in pipes, and performance inefficiencies in clear_bss and LoongArch's memcpy implementations.

Comment thread os/src/sync/intr_guard.rs
Comment on lines +56 to +64
let cpu_id = CPU::id();
let depth = INTR_DEPTH[cpu_id].0.load(Ordering::Relaxed);
let was_nested = depth > 0;
if !was_nested {
let flags = CPU::disable_interrupts();
SAVED_INTR_FLAGS[cpu_id].0.store(flags, Ordering::Relaxed);
}
INTR_DEPTH[cpu_id].0.fetch_add(1, Ordering::Relaxed);
core::sync::atomic::fence(Ordering::Acquire);

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

high

The current implementation of IntrGuard::new has a potential race condition and instability issue. It retrieves the cpu_id and checks the nesting depth before disabling interrupts. If the task is preempted and migrated to another CPU between CPU::id() and CPU::disable_interrupts(), it will operate on the wrong CPU's state. Additionally, an interrupt occurring between the check and the disable could lead to inconsistent nesting state. Interrupts should be disabled first to pin the task to the current CPU and ensure atomicity of the state update, maintaining consistency with existing architecture implementations.

References
  1. Prioritize consistency with existing architecture implementations (like RISC-V) when adding support for a new architecture (like LoongArch).

crate::arch::ArchImpl::copy_from_user(
user_ptr as usize,
UA::from_usize(user_ptr as usize),
val.as_mut_ptr() as *mut u8,

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

high

The read_from_user function ignores the result of copy_from_user. If the copy fails (e.g., due to an invalid user pointer), the function will proceed to call assume_init() on uninitialized memory, leading to undefined behavior. This function should ideally return a Result or Option, or handle the error safely.

Comment thread os/src/kernel/boot.rs
Comment on lines +50 to +58
let sbss_paddr =
unsafe { crate::arch::va_to_pa(crate::arch::address::VA::from_usize(sbss as usize)) };
let ebss_paddr =
unsafe { crate::arch::va_to_pa(crate::arch::address::VA::from_usize(ebss as usize)) };

(sbss_paddr.as_usize()..ebss_paddr.as_usize()).for_each(|a| unsafe {
let va = crate::arch::pa_to_va(crate::arch::address::PA::from_usize(a));
(va.as_usize() as *mut u8).write_volatile(0)
});

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

The clear_bss implementation is highly inefficient. It iterates byte-by-byte and performs two architecture-specific address translations (va_to_pa and pa_to_va) for every single byte. Since BSS is typically contiguous in the kernel's virtual address space, it would be much faster to use a bulk zeroing operation like core::ptr::write_bytes on the virtual range directly.

Comment on lines +7 to +14
pub unsafe extern "C" fn memcpy(dst: *mut u8, src: *const u8, n: usize) -> *mut u8 {
for i in 0..n {
unsafe {
ptr::write(dst.add(i), ptr::read(src.add(i)));
}
}
dst
}

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

The memcpy implementation for LoongArch uses a byte-by-byte copy loop, which is inefficient for large buffers. To maintain consistency with existing architecture implementations like RISC-V, consider using word-sized copies (e.g., usize) where possible to improve performance and ensure architectural parity across the repository.

References
  1. Prioritize consistency with existing architecture implementations (like RISC-V) when adding support for a new architecture (like LoongArch).

crate::kernel::syscall::io::wake_poll_waiters();
}
result
self.buffer.lock().read(buf)

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

The refactoring removed the calls to wake_poll_waiters() in the read and write methods. If the system relies on these manual wakeups to notify tasks blocked on poll or select about pipe activity (e.g., space becoming available in the buffer), this change will break asynchronous I/O for pipes. Please ensure that pipe event notification is handled elsewhere.

Comment thread os/src/arch/arch_impl.rs
}

let space = $crate::kernel::current_memory_space();
let guard = space.lock();

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

The validate_user_copy_range function acquires a lock on the current memory space. Since this is called within copy_from_user and related methods, there is a risk of deadlock if these methods are invoked while the MemorySpace lock is already held by the current task (e.g., during certain VM operations or complex syscalls).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant